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Abstract 

We discuss non-Gaussian random matrices whose elements are random variables 
with heavy-tailed probability distributions. In probability theory heavy tails of 
the distributions describe rare but violent events which usually have dominant 
influence on the statistics. They also completely change universal properties 
of eigenvalues and eigenvectors of random matrices. We concentrate here on 
the universal macroscopic properties of (1) Wigner matrices belonging to the 
Levy basin of attraction, (2) matrices representing stable free random variables 
and (3) a class of heavy-tailed matrices obtained by parametric deformations 
of standard ensembles. 

13.1 Introduction 

Gaussian random matrices have been studied over many decades and are well 
known by now [Meh04]. Much less is known about matrices whose elements 
display strong fluctuations described by probability distributions with heavy 
tails. 

Probably the simplest example of a matrix from this class is a real symmetric 
(Aij = Aji) random matrix An with elements A^, 1 < i < j < N, being 
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independent identically distributed (i.i.d.) centered real random variables with 
a probability density function (p.d.f.) falling off as a power 

p(x) ~ |xr Q_1 . (13.1.1) 

for | a; | — ► oo. The smaller is the value a the heavier is the tail. Higher moments 
of this distribution do not exist. For a E (0, 1] the tail is extremely heavy 
and the mean-value does not exist since the corresponding integral f xp(x)dx 
is divergent. For a G (1,2] the mean- value does exist but the variance does not. 
The influence of heavy tails on the statistical properties of a random matrix is 
enormous. It is particularly apparent for a < 1. In this case the elements Aij 
assume values scattered over a wide range which itself quickly increases when 
N goes to infinity. The largest element |a max | of the matrix is of the order 
|flmax| ~ iV 2 / a and its value strongly fluctuates from matrix to matrix. The 
distribution of the normalized value of the maximal element x = |o m aa; 
is given by a Frechet distribution which itself has a heavy-tail. The largest 
element of the matrix may be larger than the sum of all remaining ones. The 
values of the elements change so dramatically from matrix to matrix that one 
cannot speak about a typical matrix or about self-averaging for large N. In 
the limit N — > oo matrices A n may look effectively very spars^]. Indeed if one 
considers a rescaled matrix ^4Ar/|a max | one will find that only a finite fraction of 
all elements of this matrix will be significantly different from zero. This effective 
sparseness is quantified by the inverse participation ratio Yi constructed from 
normalized weights Wy = |cty|/ Ylij \ a ij\i which sum up to unity Ylij w ij = 1> 



Y, 



i<3 



i)E«4 • ( 13 - L2 ) 



The bar denotes the average over matrices An. In the limit N — > oo the 
participation ratio is Y% = 1 — a > for a £ (0, 1) [Bou97]. This means that 
only a finite fraction of matrix elements is relevant in a given realization of 
the matrix. This is a completely different behavior than the one known from 
considerations of Gaussian random matrices. For a < [1,2), although Y2 = 
in the limit N — > 00, one still observes very large fluctuations of individual 
matrix elements which in particular lead to a localization of eigenvectors that 
will be shortly discussed towards the end of the next section. Only for a > 2 
the behavior of matrices resemble^! that known for Gaussian matrices. In this 
case, the variance a 2 of (|13.1.ip is finite and the eigenvalue density of the 
matrix A^/y/N converges for N — > 00 to the Wigner semicircle law p(A) = 



1 Sparse random matrices are discussed in Chapter 23 of this book. 

2 The convergence to the limiting semicircle law is generically very slow in the presence of 
power-law tails and moreover for a < 4 microscopic properties are significantly different than 
for generic Gaussian matrices as we will shortly mention later. 
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\J 4a 2 — A 2 / (27r<7 2 ), independently of the details of the probability distribution. 
Random matrices having the same limiting eigenvalue density for N — > oo are 
said to belong to the same macroscopic universality class. For a > 2 it is 
called a Gaussian universality. This class is very broad and comprises a whole 
variety of random matrices. In particular one can prove [Pas72] that if An is 
a symmetric random matrix with independent (but not necessarily identically 
distributed) centered entries with the same variance a 2 , then the condition for 
the eigenvalue distribution of An/ \/iV to converge to the Wigner semicircle law 
is 

lim -r^y^ x 2 pij(x) dx = (13.1.3) 

where e is any positive number and Pij(x) is the p.d.f. for the ij'-th element 
of the matrix. As a matter of fact this condition is almost identical as the 
Lindeberg condition known from the central limit theorem for the distribution 
of a sum of random numbers to converge to a normal distribution [Fel71j . The 
fastest convergence to the limiting semicircle law is achieved for matrices whose 
elements are independent Gaussian random variables. A prominent place in this 
macroscopic universality class is taken by the ensemble of symmetric Gaussian 
matrices whose diagonal elements have a twice bigger variance than the off- 
diagonal ones M(0,a 2 (l + <%))• Clearly, such matrices fulfill the Linderberg 
condition. The probability measure in the ensemble of such matrices can be 
written as 

du(A) = DAexp ^tr^ 2 (13.1.4) 

2a z 

where DA is a flat measure DA = Yli<i<j<N dAij. The measure dfj,(A) is man- 
ifestly invariant with respect to the orthogonal transformations: A — ► OAO T ', 
where O is an orthogonal matrix. This is the GOE ensemble. In the limit 
N — > oo the eigenvalue density of An/^/N approaches a well-known semicir- 
cle distribution. In a similar way one can construct GUE and GSE ensembles 
which are extensively discussed in other chapters of the book. In this chapter 
we will be mostly interested in matrices for which the variance does not exist. 



13.2 Wigner-Levy matrices 

In this section we will discuss properties of heavy tailed symmetric matrices 
with i.i.d. elements (I13.1.ip for a £ (0, 2). We call them Wigner-Levy matrices 
since the Levy distribution is the corresponding stable law for < a < 2 which 
plays an analogous role from the point of view of the central limit theorem 
as the Gaussian distribution for a > 2 [Gnc68j. The Levy distributions are 
sometimes called a-stable laws. 

Before we discuss Wigner-Levy matrices let us briefly recall basic facts about 
Levy distributions. In addition to a stability index a these laws are character- 
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ized by an asymmetry parameter [5 G [—1,1], which will be discussed below, 
and a scale parameter R > 0, called the range, which plays a similar role as 
the standard deviation a for the normal law. The statement that a Levy dis- 
tribution is a stable law (with respect to addition) means that a sum of two 
independent Levy random variables x = x\ + X2 with a given index a is again 
a Levy random variable with the same a. The range R and asymmetry (3 of 
the resulting distribution can be calculated as 

R<* = R? + R%, ^ = M+M, (13.2.1) 

For Pi = p2 the asymmetry is preserved and the relation for the effective range 
is a generalization of the corresponding one for independent Gaussian random 
variables, where the sum is also a Gaussian random variable with the variance 
a 2 = a\ + cr|. Actually for a = 2 the range and the standard deviation are 
related as a = \p2R. 

The p.d.f. La'^(x) of the Levy distribution with the stability index a, the 
asymmetry P and the range R is conventionally written as a Fourier transform 
of the characteristic function, since its form is known explicitly. There are 
several definitions used in the literature, here we quote one, which seems to be 
the most common [Gne681 NollO^ 

L^(x) = — / dkL^(k)e- lkx (13.2.2) 

^ J-oo 

wher^§ 

c(k) = Inly (k) = -R a \k\ a (l + iPsgn(k)t&n{ira/2)) . (13.2.3) 

The logarithm of the characteristic function c(k) is usually called a cumulant- 
generating function. This name is slightly misleading since for Levy distri- 
butions only the first cumulant exists for a £ (1,2) or none for a £ (0,1]. 
Therefore we will rather call it the c-transform. The characteristic function is 
known explicitly in contrast to the corresponding p.d.f. L a (x) which can be 
expressed in terms of simple functions only for a = 1, P = (Cauchy distribu- 
tion) a = 3/2, P = ±1 (Smirnoff distribution). For a = 2, the characteristic 
function (|13.2.3p becomes a Gaussian function independent of p. A stability 
of the Levy distribution can be easily verified. A p.d.f. for the sum of two 
independent variables X1+2 = xi +X2 is a convolution of the p.d.f.'s for individ- 
ual components xi and X2 and thus the corresponding characteristic function 
is a product of two characteristic functions. It is easy to check by inspection 
of (|13.2.3p that the relations (|13.2.ip are indeed satisfied. It is less trivial to 
demonstrate that the Fourier transform of the characteristic function (|13.2.2p 



For a = 1 it assumes a slightly form: c(k) — —R\k\ (1 + i(2/3/7r)sgn(fc) ln(Rk)). 
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is a non-negative function. Actually this is only the case for < a < 2. Levy 
distributions (x) for a G (0,2] are the only stable laws. The asymmetry 
parameter [3 controls the skewness of the distribution. For [3 = 0, the charac- 
teristic function (113. 2.3ft is even and so is the p.d.f. La ,0 (x) = La'°(—x). For 
0^=0 the p.d.f. is skew and has a different left and right asymptotic behavior 

L y (xr ^±oo {1±p) ^_ (1324) 

where j a = T(l + a) sin(-7ro;/2)/7r. In the extreme cases (3 = ±1 one of the tails 
is suppressed and for a < 1 the distribution becomes fully asymmetric with a 
support only on the positive (resp. negative) semiaxis. One should note that the 
mean value of the Levy distribution for a > 1 equals zero, independently of the 
asymmetry (3. For a = 2 the dependence on (3 disappears. If one sets R = 1 in 
(I13.2.3h one obtains a standardized Levy distribution. A Levy random variable 
x with an arbitrary R can be obtained from the corresponding standardized 
one by a rescaling x = Rx*, hence La ,l3 {x) = L 1 ^ (x / R) / R. 

Let us now consider a symmetric matrix An with elements Ay for 1 < i < 
j < N being i.i.d. Levy random variables with the p.d.f. La'^{x). We are now 
interested in the eigenvalue density of such a matrix in the limit N — > oo. As 
we shall see below the eigenvalue density of the matrix A^/N 1 ^ converges to 
a limiting density p(X) which is completely different from the Wigner semicir- 
cle law and has an infinite support and heavy tails. The choice of the scaling 
factor N l / a is related to the universal scaling properties of the Levy distribu- 
tion and for this reason can be viewed as a generalization of the scaling for 
Gaussian random matrices (a = 2). The problem of the determination of the 
limiting eigenvalue distribution of Wigner-Levy matrices is not simple because 
the standard methods used for Gaussian matrices or matrices from invariant 
ensembles do not apply here. A special method tailored to the specific univer- 
sal features of heavy-tailed distributions was necessary to attack this problem. 
Such a method, being a beautiful adaptation of the cavity method which max- 
imally exploits universal properties of a-stable distributions, was invented in 
|Ciz94| . Thedescript ion of the method is beyond the scope of this chapter and 
we refer the interested reader to the original paper [Ciz94j and to the paper 
[Bur07] where the derivation was explained step by step and some details were 
corrected. Here we only quote the final result. The eigenvalue density p(X) of 
a Wigner-Levy matrix is given by a Levy function with an index a/2, being a 
half of the index a of the p.d.f. La^(x) used to generate the matrix elements, 
and an effective "running" range R(X) and asymmetry parameter /3(A) 

P^ = Tp(t) where m = L^ m (X) (13.2.5) 
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and 

T(l + a)cos(vra/4)^ 1/a 



A - R { r ( i-w2) - ■ <13 ' 2 - 6) 

The scale parameter A is proportional to the original range R. The functions 
R(X), /3(A) satisfy a set of integral equations 

/+oo ^ — 

dx\x\-%LffiK x \\-x) (13.2.7) 
-oo 

- f + ^° dx sign(x)|x| _ ^L i? ^' ) ' /3( ' :r ' ) (A — x) 

/3(A) = ^ 1 — . °/ 2 (13.2.8) 



f +oc bl-fL^f^fA 

J— oo I I a/2 



X) 



where the integrals should be interpreted as Cauchy principal values. The 
equation (jl3.2.8j) . which was derived in [Bur07| . is a corrected version of this 
equation given in [C"iz94j . 

We conclude the section with some comments. 

The dependence on A on the right hand side of (|13.2.5j) appears not only 
through the main argument of the function but also through the dependence of 
the effective parameters R{\) and /3(A) on A. This makes the resulting expres- 
sion very complex. The equations for /3(A) and R(X) cannot be solved analyt- 
ically. It is also not easy to solve them numerically because the computation 
of the function (x), which is the main building block of the above integral 
equations, is numerically unstable if one does the Fourier integral (|13.2.2p in 
a straightforward way since its integrand is a strongly oscillating function. It 
is necessary to apply a non-trivial transformation of the integration contour 
in the complex plane to assure that the integration becomes numerically sta- 
ble [NollO] . We skip the details here and again refer the interested reader to 
[Bur07] for details. 

The eigenvalue density (|13.2.5p holds for any a G (1,2) and is independent 
of the asymmetry of the p.d.f. for matrix elements. The independence of [3 
might be surprising but it can be easily checked by inspection. Indeed, equations 
(|13.2.7p and (113.2.8|) depend only on a and do not involve any dependence on 
the parameters /3 or R of the p.d.f. for matrix elements, La ,/3 (x) . Numerically 
the solution (I13.2.5P seems to extend also to the range of a £ (0, 1] but for 
symmetric distributions (/3 = 0) only. 

The limiting eigenvalue density p(X) is an unimodal even function. The 
height of the maximum at A = is 

, , r(l + 2/a) /r 2 (l + a/2) 

For large |A| — > oo the eigenvalue density has heavy tails 

p(X) ~ ~r(l + a) sin (^) (13.2.10) 




Figure 13.1: The eigenvalue density for infinite Wigner-Levy matrices with 
the index a = 1.0,1.25,1.5,1.95 and the range R = 1 (solid line) and the 
corresponding numerical densities obtained by Monte-Carlo generated matrices 
of size 200 x 200 with entries distributed according to Ld (x). 



with the same power as the p.d.f. for the matrix elements, although at first 
sight one might have the impression that the power should rather be a/2 for 
it is the index of the Levy function on the right hand side of (|13.2.5p . This is 
not the case because also the dependence of the running parameters -R(A) and 
/3(A) on A contributes to a net asymptotic behavior. In Figure [13.11 we show 
limiting distributions p(X) of the Wigher-Levy matrices with R = 1 and different 
values of a and compare them to eigenvalue histograms obtained numerically 
by Monte-Carlo generation of symmetric 200 x 200 matrices with i.i.d. entries 
with the p.d.f. (x). The agreement is very good and finite size effects are 
small. 

Another feature of Wigner-Levy matrices which distinguishes them from 
Wigner matrices from the Gaussian universality class is a localization of the 
eigenvectors [Ciz94j. The degree of localization is measured by the inverse 
participation ratio 

N 

y 2 = 5>* 4 (13.2.H) 

i=l 

calculated for the elements ^'s of a normalized eigenvector Yli=i "0? = 1- For 
Wigner matrices from the Gaussian universality class yi = in the limit N — ► 
oo and it is independent of the corresponding eigenvalue. For Wigner-Levy 
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matrices with the index a E (1,2), the average inverse participation ratio 2/2 = 
i/2 (A) depends on the eigenvalue A. For A smaller than a certain critical value, 
A < A cr , 2/2 (A) = 0, however for A > \ cr above this value 2/2 (A) is positive and 
it grows monotonically to one when A increases [Ciz94] . This means that only 
a finite number of elements ipi are significantly different from zero or phrasing 
it differently that the vector is localized on a subset of cardinality of the order 
I/2/2 (A). When A goes to infinity 2/2 (A) tends to one. This means that for 
extremely large eigenvalues all but one elements of the corresponding vector 
are equal zero and the eigenvector is localized on exactly one state. The critical 
value A cr depends on a and grows monotonically from zero for a = 1 to infinity 
for a = 2. The fact that X cr = for a = 1 means that in this case (and also 
for a < 1) all eigenvectors are localized. 

The eigenvalue density (113.2.5P defines a whole universality class of Wigner- 
Levy matrices. It is a counterpart of the Wigner semicircle law. The eigenvalue 
density of a matrix An /N l l a with i.i.d. entries with a p.d.f. p{x) will converge 
to the same limiting law if the function p(x) has the same asymptotic behavior 
as La (x). More precisely, if the p.d.f. p(x) is centered (f^°° dxxp(x) = 0) for 
a £ (1, 2) (or even for a G (0, 1]) and has the following asymptotic behavior 

P( x ) ^ (13-2.12) 

then the eigenvalue distribution of the matrix A^/N 1 ^ converges to the same 
limiting density p(x) (|13.2.5|) as for the corresponding matrix with the p.d.f. 
L a (x) with the same index a and 

where as before 7 Q = T(l + a) sin(-7ra/2)/7r. One can probably extend this 
macroscopic universality to a class of matrices with independent but not neces- 
sarily identically distributed entries however having distributions with the same 
asymptotic behavior. 

We finish this section with a comment on microscopic properties of Wigner 
matrices with heavy tails (|13.1.ip . We have seen so far that macroscopic prop- 
erties of Wigner matrices change at a = 2. For a > 2 Wigner matrices belong 
to the Gaussian universality class and their eigenvalue density converges for 
large N to the Wigner semicircle law while for a < 2 they belong to the Levy 
universality class and their eigenvalue density converges to the limiting distri- 
bution given by (|13.2.5p . One may ask if a = 2 is also a critical value for 
microscopic properties like for instance eigenvalue correlations or the statis- 
tics of the largest eigenvalue X max . This question has already been partially 
studied. It was found that in this case the critical value of the exponent a 
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is rather a = 4 [Bir07aJ. For a > 4 the largest eigenvalue of the matrix 
An/v'N, for N — » oo, fluctuates around the upper edge of the support of the 
Wigner semicircle distribution and the fluctuations are of the order iV -2 / 3 . A 
rescaled quantity x = {\ ma x — ^edge)N 2 ^ 3 obeys the Tracy- Widom statistics 
[Tra94, Tra96j although the convergence to this limiting distribution is rather 
slow. For a < 4 the largest eigenvalue is of the order N^~ a ^^ 2a ^ and a rescaled 
quantity y = X max /N^~ a '^ 2a ' is distributed according to a modified Frechet 
law. For a = 4 which is a marginal case the two regimes are mixed in the pro- 
portions depending on details of the p.d.f. for matrix elements, in particular on 
the amplitude of the tail. Roughly speaking for a > 4 the eigenvalue repulsion 
shapes the microscopic properties of the matrix and leads to the Tracy- Widom 
statistics while for a < 4 the repulsion plays a secondary role. The dominant 
effect is in this case related to fluctuations in the tail of the distribution which 
are so large that the repulsion can be neglected and the largest eigenvalues can 
be treated as independent of each other. Actually it has been known for some 
time [Sos04| that indeed the largest eigenvalues are given by a Poisson point 
process with a Frechet intensity related to the statistics of the largest elements 
in the random matrix coming from the tail of the distribution (|13.1.ip for a < 2. 
In paper [Bir0 7a] an argument was given that basically the same picture holds 
also for 2 < a < 4. 

Wigner matrices are discussed in Chapter 21. The interested reader can 
find there a discussion also of other aspects of this class of random matrices. 

13.3 Free random variables and free Levy matrices 

It is sometimes convenient to think of whole matrices as entities and to formu- 
late for them probabilistic laws. One can ask for example if one can calculate 
the eigenvalue density /514.2(A) of a sum of two symmetric (or hermitian) N x N 
random matrices 

A 1+2 =A 1 +A 2 (13.3.1) 

given the densities pi(A) and /02(A) of A\ and A%. In general the answer to this 
question is negative since the resulting distribution depends on many other fac- 
tors. The situation becomes less hopeless for large matrices. It turns out that 
for N — > 00, /514.2(A) depends only on pi (A) and /92(A) if A\ and A2 are inde- 
pendent random matrices or saying more precisely if they are free. The freeness 
is a concept closely related to the independence. The independence itself is not 
sufficient. The freeness additionally requires a complete lack of angular corre- 
lations. Such correlations may appear even for independent matrices if they 
are generated from a matrix-ensemble which hides some characteristic angular 
pattern. An example is just the ensemble of Wigner-Levy matrices whose prob- 
ability measure is not rotationally invariant. This measure favors some specific 
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angular directions. In effect, even if one picks at random two matrices from 
this ensemble, they both will prefer some angular directions and thus will have 
some sort of mutual correlations. One can remove the correlations by a uniform 
angular randomization of the matrices A± and A2 

A la2 = O x A x 6[ + 2 A 2 0l (13.3.2) 

where Oj's are random orthogonal matrices with a uniform probability mea- 
sure in the group of orthogonal matricefl The matrix OiAiOj has exactly 
the same eigenvalue content as A{. Actually to achieve the effect of the an- 
gular decorrelation it is sufficient to rotate only one of A^s. This type of 
addition is called a free addition and we denote it by EEL Of course if A^s 
are generated from an ensemble with a rotationally invariant probability mea- 
sure, as for instance DAexp — tiV(A), then the two types of additions (|13.3.ip 
and (|13.3.2p are identical and pi +2 (X) = Pih2(A). Otherwise they are differ- 
ent and pi +2 (\) 7^ piffl2(A). For example a sum of independent identically 
distributed Wigner-Levy matrices (|13.3.ip is a Wigner-Levy matrix which is 
not rotationally invariant while a free sum of Wigner-Levy matrices is a rota- 
tionally invariant matrix, which has a different eigenvalue density, so in this 
case pi +2 (\) 7^ /?iB32(A). Generally, for large matrices the eigenvalue density 
PiES2(^) of a free sum depends only on pi(\) and p 2 (X) and it can be uniquely 
determined from them [Voi92]. 

A theory of free addition was actually developed in probability theory of 
non-commutative objects (operators) long before random matrices entered the 
scene JVoi85] . It was originally formulated in terms of von Neumann algebras 
equipped with a trace-like normal state r, which was introduced to general- 
ize the concept of uncorrelated variables known from a classical probability. 
More specifically in classical probability two real random variables x\ and x 2 
are uncorrelated if the correlation function, calculated as the expectation value 
E(xix 2 ) = for centered variables, Xi = Xi — E(xi), vanishes. In free proba- 
bility, by analogy, elements Xi are free if t(Xk(i)X%(2) • • -^(m)) = f° r an Y 
permutation 7r of the corresponding centered elements Xi = Xi — r(Xj) ■ 1. A 
link to large matrices was discovered later [Voi91] when it was realized that a 
non-commutative probability space with free elements AYs can be mapped onto 
a set of large N x N matrices, N — > 00, of the form Xi = UiDillj , where Di 
are diagonal real matrices and Ui random unitary (or orthogonal, Oj) matrices 
distributed with a uniform probability measure on the group. In this map- 
ping the state function r corresponds to a standard normalized trace operation 
t(. . .) <-> itr(. ..). This observation turned out to be very fruitful both for 
free probability and for theory of large random matrices since one could suc- 
cessfully apply results of a free probability to random matrices and vice versa 

4 For hermitian matrices Ai one uses unitary rotations Ai + 2 = UiAiUl + U2A2UI 
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[Voi92]. Free probability and its relation to random matrices and planar com- 
binatorics for the case when all moments of the probability distribution exist 
|Spe94 rNic06] is discussed in detail in Chapter 22. In this section we concen- 



trate on heavy-tailed free distributions for which higher moments do not exist. 
In particular we shall discuss free stable laws and their matrix realizations. Be- 
fore we do that we recall basic concepts to make the discussion of this section 
self-contained. 

Actually the law of addition of free random variables is in many respects 
analogous to the law of addition of independent real random variables. We will 
therefore begin by briefly recalling the law of addition in classical probability 
and by analogy describe the corresponding steps in free probability. The p.d.f. 
Pi+2(x) of a sum x = x\ + X2 of independent real random variables x\ and X2 
is given by a convolution of the individual p.d.f. 's p\ + 2{x) = (pi *p2){x). Thus, 
the characteristic function p(k) = f dke lkx p(x) for the sum is a product of 
the corresponding characteristic functions pi^ik) = Pi(k)p2(k). This is more 
conveniently expressed in terms of a cumulant-generating function (c-transform) 
defined as c(k) = In p(k), as a simple additive law 

c 1+2 (k) =ci(fc) + c 2 {k). (13.3.3) 

It turns out that one can find a corresponding object in free probability |Voi92| . 
a free cumulant-generating function alternatively called i?-transform, for which 
the free addition (|13.3.2 j) leads to a corresponding additive rule 

Rvsn(z) = R x (z)+R 2 {z) . (13.3.4) 

For a given eigenvalue density p{\) one defines a moment-generating function 
as a Cauchy transform of the eigenvalue density [Voi92j 

G(2)= pPWdx 

J-oo z — * 

known also as the resolvent or Green function. The free-cumulant-generating 
function (i?-transform) is related to the Green function as 

z = G(R{z) + -^j . (13.3.6) 

The last equation can be inverted for G(z) 

z = R(G(z)) + . (13.3.7) 

In short z — ► R(z) + 1/z is the inverse function of z — > G(z) and thus for a 
given resolvent one can determine the i?-transform and vice versa. Using the 
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addition law (|13.3.4|) one can now give a step-by-step algorithm to calculate 
the eigenvalue density pin2(A) of the free sum (|13.3.2p from pi(A) and /02(A). 
First one determines the resolvents G\{z) and G2(z) using (|13.3.5p . then the 
corresponding /^-transforms R\{z) and R2(z) using (|13.3.6p and finally R\m2{z) 
using the addition law (|13.3,4p . Having found R\bi{z) one proceeds in the 
opposite order. One reconstructs the corresponding resolvent G\ m 2{z) (|13.3.7p 
and then the density pim(X) by the inverse of (|13.3.5p 

p(X) = -— ImG(A + i0 + ) (13.3.8) 

7T 

which follows from the relation (x + + )~ 1 = P.V.x~ l — i'k5{x). This completes 
the task of calculating the eigenvalue density of a free sum /9i ffl 2(A) from pi(A) 
and /02(A). This procedure can be fully automatized and it actually has been 
implemented for a certain class of matrices |Rao06j . 

The correspondence between the laws of addition (|13.3.3p and (|13.3.4p and 
between the logical structures behind these laws in classical and free probability 
has also profound theoretical implications. One of them is a bijection between 
infinitely divisible laws of classical probability and the laws in free probability 
[Ber93|. Using this bijection one can derive the i?-transform for stable laws in 
free probability |Ber99| 

BW _f bz**- 1 for a G (0, 2] and a + 1 

~ I " (2/3 AO In z for ct = 1 {^-^) 

where b = _ e <«(l+/»W2 for a G (0,1) and b = e i(«-2)(l+0W2 for a G (1,2]. 
The stable /^-transforms (I13.3.9P are in one-to-one correspondence with the c- 
transforms of Levy distributions (|13.2.3p and they fully classify all free stable 
laws and allow one to determine the free probability densities for these stable 
laws. These densities are equal to the eigenvalue densities of free Levy matrices 
being matrix realizations of the stable free random variables. 

Let us illustrate how this procedure works for a stable law with the stability 
index a = 2 and unit range. Using (|13.3.9p we have R{z) = z. Inserting this to 
(|13.3.7p we obtain an equation for the resolvent G{z) = z — 1/G(z) which gives 
in the upper complex half-plane G(z) = (z — V z 2 — 4)/2. Finally using (|13.3.8p 
we find a Wigner law p(X) = s/A — \ 2 /2ir with a = 1. We see that the Wigner 
law is equivalent in free probability to the normal law in classical probability. 
Secondly consider the case for a = 1, (3 = 0. Proceeding in the same way as 
above we find respectively 

R{z) = -i , G{z) = , p{\) = -— ^ . (13.3.10) 

Z + l 7T 1 + A z 



5 We give only the standardized version which corresponds to the unit range. An R- 
transform with a range r can be obtained from the standardized one by a rescaling R r (z) = 
rR(rz). 
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This case is special because the stable density is identical in classical and free 
probability. For other values of a one can find the free stable laws by applying 
the equation (|13.3.7|) to the i?-transform (|13.3.9p which gives the following 
equation for the resolvent 

bG a (z)- zG(z) + 1 = . (13.3.11) 

This equation can be solved analytically for a couple of values of the parameter 
a for which it is just a quadratic, cubic or quartic equation. The solution can 
be then used to calculate the eigenvalue density (|13.3.8j) . For other values the 
eigenvalue density can be determined numerically. 

The equation (|13,3.11j) may be used to extract the asymptotic behavior of 
the corresponding eigenvalue density () 13.3.8H . We will give the result only for 
the symmetric case {(3 = 0) and for the range R = 1. For small eigenvalues 
|A| -► it reads [Bur02j 

"W = K 1_ ^ iA2+ - ") (13 ' 312) 

while for large ones, |A| — > oo 

^-Ism^lAp- 1 . (13.3.13) 

The distribution has a smooth quadratic maximum at A = while for large A 
it has heavy power-law tails with the same power as the corresponding stable 
law for real variables. For a = 2 the tails disappear. 

The procedure described above to derive the free probability density p(X) 
of free stable laws is solely based on relations between the i?-transform, the 
resolvent and the density. It does not involve any matrix calculations. Free 
random matrices appear as a representation of these free random variables and 
correspond to infinitely large random matrices with a given eigenvalue density 
p(A) and with a rotationally invariant probability measure. Actually there are 
many different matrix realizations of the same free random variable. We shall 
below discuss two simplest ones which have completely different microscopic 
properties. The most natural realization is a matrix generated by the uniform 
angular randomization ODO T of a large diagonal matrix D of size N — ► oo 
which has i.i.d. random variables on the diagonal. If we choose the p.d.f. of the 
diagonal elements as p(X), which corresponds to a free stable law, we obtain 
a matrix realization of a free random variable which is stable under addition. 
Clearly, the eigenvalues of this matrix are by construction uncorrelated. We 
can now independently generate many such matrices OiDiOf, i = 1,...,K. 
Due to the stability also the following sum 

1 K 

A ^ = ^J2°^°i ( 13 - 3 - 14 ) 

i=l 
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Figure 13.2: Level spacing histograms of the matrices A m x (I13.3.14P of size 
200 x 200 obtained from diagonal matrices D{ whose eigenvalues were gen- 
erated independently from a Wigner semicircle. For K = 1 the histogram 
properly reflects the Poissonian nature of eigenvalues of a single matrix D{. For 
K > 1 (already for K = 2) the histogram has a shape of the Wigner surmise 
characteristic for a typical eigenvalue repulsion. 

has exactly the same eigenvalue density p{X) as each term in the sum, so it is a 
representation of the same free random variable. However the sum A a x belongs 
to a different microscopic class since its eigenvalues repel each other contrary to 
eigenvalues of each matrix in the sum which are uncorrelated by construction. 
Already for K = 2 one observes a standard repulsion characteristic for invariant 
ensembles as illustrated in Figure 113.21 It is tempting to conjecture that for 
K — > oo the sum A b k becomes a random matrix which maximizes entropy for 
the stable eigenvalue density p(X). The probability measure for such matrices 
is known to be rotationally invariant dp(A) = DAexp — NtrV(A) and to have 
a potential V(A) which is related to p(X) as [Bal68 



For free stable laws the resolvent G(X) is given by a solution of (|13.3. 1 1 [) . In 
particular, for a = 2 the last equation gives a quadratic potential V(A) = A 2 /2, 
while for a = 1 and (3 = a logarithmic one V(A) = ln(A 2 + 1). The potential 
can also be determined for other values of a. Generally, for large |A| — ► oo it 
behaves as 



For maximal entropy random matrices one can also find the joint probability 
which takes a standard form 




(13.3.15) 



V(A) = 2 In A - 2a" 1 Re (bX~ a ) + ... 



(13.3.16) 




(13.3.17) 



i<j 
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Figure 13.3: Eigenvalue density for Wigner-Levy matrices and corresponding 
free Levy matrices for a = 1.00 (left) and a = 1.50 (right) are represented by 
solid line. The corresponding Monte-Carlo histogram for a free sum (|13.3. 18|) 
of K = 32 Wigner-Levy matrices of size iV = 200 is shown. For comparison we 
chose the range of the Wigner-Levy matrix to be R = (r(l + a))~ l l a since then 
the asymptotic behavior of the eigenvalue density for Wigner-Levy matrices 
(|13.2.10p is identical as for the standardized free random variables (|13.3.13p . 

where C is a normalization and (3 as usual is equal 1 or 2 for orthogonal or 
unitary invariant ensemble respectively. For free stable laws the potential V{\) 
assumes however a highly non-standard non-polynomial form [Bur02j. 

The stable laws in a free probability play a similar role as the correspond- 
ing stable laws in classical probability where a sum of i.i.d. centered random 
variables s n = (x± + . . . + x n )/n 1 / a is known to become an a-stable random 
variable for n — > oo. We expect a similar effect for random matrices |Hia00| . 
For example, if one generates a sequence of K independent Wigner-Levy matri- 
ces (Ai, A2, ■ ■ ■ , Ak) and a sequence independent random orthogonal matrices 
(Oi, O2, ■ ■ ■ , Ok) and one forms a sum 

1 - 

A ^ = ^Y. ^ ( 13 - 3 - 18 ) 

i=l 

in analogy to (|13.3.14p one expects that for large K the sum will become a 
free random matrix with an eigenvalue density given by a free a-stable law 
[BurQ7j. Actually in practice the convergence to the limiting distribution is 
very fast. We observe that already for K or order 10 and N of order 100 the 
eigenvalue density of the matrix A m x does not significantly differ from the stable 
density p(X) for free random variables (see Figure [13. 3D . If we skipped random 
rotations in (|13.3. 18j) and just added many Wigner-Levy matrices {A\ + . . . + 
Ax)/K l / a we would obtain a Wigner-Levy matrix. As we mentioned already, 
the independence of the matrices Ai itself is not sufficient to make the addition 
free and only the rotational randomization brings the sum to the universality 
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class of free random variables. The comparison of the eigenvalue density of the 
Wigner-Levy and the corresponding free Levy matrices is shown in Figure [13.31 



13.4 Heavy tailed deformations 

In this section we discuss ensembles of random matrices obtained from the 
standard ensembles by a reweighting of the probability measure. We will begin 
by sketching the idea for real random variables where the procedure is simple 
and well known and then we will generalize it to random matrices. We will in 
particular concentrate on heavy-tailed deformations of the probability measure 
of the Wishart ensemble which are applicable to the statistical multivariate 
analysis of heavy tailed data. 

Consider a random variable x constructed as a product x = <r£ where £ is a 
normally distributed variable M(0, 1) and a £ (0, oo) is an independent random 
variable representing a fluctuating scale factor having a p.d.f. f(cr). One can 
think of Gaussian variable AA(0, a 2 ) for which the variance itself is a 

random variable. Obviously the p.d.f. of x can be calculated as 

r°° l 
p(x) = / daf(a)^=^e . (13.4.1) 



Choosing appropriately the frequency function f(cr) one can thus model the 
p.d.f. of x. For example for |Tos04l IBer04} IAbu05] 



1 2 / a 2 \i * 2 
f( a ) = -—r^ At e'** (13.4.2) 
M ' oT(f) \2a 2 J y ' 

where a is a constant, the integral (|13.4.ip takes the following form 

(13.4.3) 

In doing the integral we changed the integration variabl^l to ( = a 2 / (2a 2 ). For 
large \x\ the p.d.f. has a power-law tail p(x) ~ The variance of the 

distribution exists only for a > 2 and is equal to a 2 / (a — 2). 

If one applies this procedure to each matrix element independently Aij = 
j, one obtains a Wigner matrix discussed in the first section of this 
chapter. One can however construct a slightly different matrix A^(a) whose 
elements have the same common random scale factor A^ = a^ij and differ 
only by &j which are i.i.d. Gaussian random variables JV(0, 1). In the limit 
N — > oo the eigenvalue density of the matrix An (a) / '\fN converges to the 



Wigner semicircle law p CT (A) = \/4cr 2 — A 2 /(27r<T 2 ). The scale factor a changes 
6 The variable ( has a \ 2 distribution. 
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however from matrix to matrix with the frequency f(cr) so in analogy to (|13.4.ip 
the effective eigenvalue density in the ensemble of matrices is given by the 
average of the semicircle law over a [Boh08j 

//■OO -1 
daf{a)p a {\) = j x da/(<7)^-2V4a2-A2 . (13.4.4) 

In particular, for the frequency function (|13.4.2|) we obtain the following eigen- 
value density |Ber04j 

which has power-law tails, p(\) ~ |A| _a_1 , with the same power as the p.d.f. 
(|13.4.3p for matrix elements. One can generalize the result to other frequency 
functions |Mut05j . 

Actually exactly the same strategy can be applied to calculate the joint prob- 
ability since fluctuations of matrix elements are modified by a common scale 
factor p(Ai, . . . , Xn) = f daf(a)p a (Xi, . . . , \n), where p a is given by f|13.3. 17j) 
with V{\) = A 2 /(2cr 2 ). From the joint probability one can then derive micro- 
scopic properties of matrices, including the microscopic correlation functions 
and the distribution of the largest eigenvalue |Boh09] • 

One can use this procedure to deform probability measures of other ma- 
trix ensembles as well. In what follows we concentrate on deformations of the 
Wishart ensemble. The probability measure for a standardized Wishart ensem- 
ble of real matrices^ is given by 

4t*(0 = (2vr)~^ e -¥* ti T D £ (13A.6) 

where £ is a rectangular matrix i = 1,...,N, t = 1,...,T and D^ = 
nft^d^it- The eigenvalue density of the matrix (1/T)££ T is known to con- 
verge to the Marcenko-Pastur law /?*(A) = y (A + — A)(A — A_)/(27rrA), where 
X± = (1 ± r) 2 and r = N/T in the limit N — > oo, r = const [Mar67]. The 
elements of the matrix £ represent normally distributed fluctuations of uncor- 
rected random numbers with unit variance. Using now the reweighting method 
we can consider a matrix An = a£,u with a common fluctuating scale factor be- 
ing an independent random variable with a p.d.f. f(cr). The effective probability 
measure can be easily derived from (|13.4.6|) and reads 

dp(A) = DA I daf(a)a~ NT e-^ trAAT (13.4.7) 



For complex matrices the corresponding measure reads dn*(£) = Tv' NT e~ tl6i D£ where 
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where the factor a~ NT comes from the change of variables in the measure 
DA = a NT D^. In particular for the frequency function (|13.4.2|) this gives the 
measure of a multivariate Student's distribution^] 

F (a±NT) f tr AA T \~ 2± ^ L 

d « (A) = D \ a ^r(i) + — ) ' (13A8) 

The eigenvalue density of the matrix (1/T)XX T , where X is generated with 
the probability measure given above can be obtained by the same reweighting 
method as before. The eigenvalue density of the Gaussian part in (|13.4.7|) 
is p CT (A) = p*(\/a 2 )/(T 2 = ^(a 2 X + - A)(A - (7 2 A_)/(2^m 2 A). It has to be 
reweighted with the frequency function (|13.4.2 j) p(A) = J daf(a)p a (X). The 
result reads |Bur06j 

PW = M^T^ 2 ' 1 t + V(A+-C)(C-A_) e-^C^dC . (13.4.9) 



2^r(f)' 



The support of this eigenvalue distribution is infinite. The exponent a/2 of 
the tail is a half of the exponent a of p.d.f. for the matrix elements X as one 
can expect for a matrix (1/T)XX T which is a "square" of X. The reweight- 
ing can be applied to derive the corresponding joint probability function and 
to determine the microscopic correlations of the deformed Wishart ensembles 
|Ake08j . 

This result can be generalized in a couple of ways. One can change the 
frequency function f(cr) [Abu09] but one can also change the relation between 
the A and £ matrix from An = cr!;it to, for instance, An = cr SiOij^jt where 
Oij is a rotation matrix and Si is a vector of positive numbers. The matrix O 
and the vector S are fixed in this construction and the only fluctuating elements 
are which are i.i.d. M(0, 1) and a which is an independent random variable 
with a given p.d.f. /(c), as before. The interpretation of the construction 
is clear. The factors S^s change the scale of fluctuations and the matrix O 
rotates the main axes. If one applies it to (|13.4.6p one will obtain a deformed 
measure (|13.4.8p where trAA T will be substituted by trAC _1 A T . The matrix 
Cij = OikS^Okj introduces explicit correlations between the degrees of 
freedom. In a similar way one can introduce correlations Cu 1 between An and 
A it i at different times t and t' [Bur06] . 

Another interesting generalization of the reweighting procedure is to con- 
sider a matrix An = (7iS,u where now the scale factors Oi are independent 
random variables for each row |Bir07b] . This case corresponds to a Wishart 
ensemble with a fluctuating covariance matrix Cij = 5ija 2 where <7j are i.i.d. 



An analogous expression [Tos04l lBer04] can be obtained for GOE and GUE Wigner ma- 
trices reweighted with the frequency function (113. 4. 2[) . 
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random variables with a given p.d.f. f(cr). This problem can be solved analyti- 
cally thanks to an explicit relation between the Greens functions and eigenvalue 
densities of the matrices C and (1/T)XX T |Bur04| . 

The idea of reweighting is quite general. So far we have described reweight- 
ing through the scale parameter a but one can use other quantities as a basis for 
the reweighting scheme as well. For example one can use the trace t = trXX T 
of the whole matrix. In this scheme the idea is to calculate quantities for the 
ensemble with a probability measur (H dyniX) = DX5(t - tiXX T ) and then to 
reweight them using a frequency function g(t) to obtain the corresponding val- 
ues for the ensemble with the measure dn(X) = DXg(tvXX T ) [Ake99llAbuTi5] . 
It turns out that the first step, that is the calculations for the fixed trace en- 
semble, can be done analytically so also this procedure gives a practical recipe 
to handle "non-standard" ensembles. Of course it works only for ensembles for 
which the measure depends on trXX T as for instance (|13.4.8j) . In particular it 
can be applied to the multivariate Wishart-Student ensembles |Bur06| . 

13.5 Summary 

Heavy-tailed random matrices is a relatively new branch of random matrix 
theory. In this chapter we discussed several matrix models belonging to this 
class and presented methods of integrating them. We believe that the models, 
methods and concepts can be applied to many statistical problems where non- 
Gaussian effects play an important role. 
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